1: L24123. Homo sapiens NRF1...[gi:438646]

LOCUS HUMNRF1A 4992 bp mRNA 

DEFINITION Homo sapiens NRF1 protein (NRF1) mRNA.
ACCESSION L24123 
VERSION L24123.1 GI:438646
KEYWORDS 

SOURCE Homo sapiens (human) 
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (bases 1 to 4992) 
AUTHORS Chan,J.Y., Han,X.L. and Kan,Y.W.
TITLE Cloning of Nrf1, an NF-E2-related transcription factor, by genetic
selection in yeast
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 90 (23), 11371-11375 (1993) 
MEDLINE 94068605
PUBMED 8248256 
COMMENT Original source text: Homo sapiens cDNA to mRNA. 
FEATURES             Location/Qualifiers
     source          1..4992
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /cell_line="K562"
                     /cell_type="erythroleukemia"
     polyA_site      4992
                     /gene="NRF1"
ggtatctgta cgtcttcgaa cctccgactt tcgttcttga ttaatgaaaa cattcttggc
aaatgctttc gctctggtcc gtcttgcgcc ggtccaagaa tttcacctct agcggcgcaa
tacgaatgcc cccggccgtc cctcttaatc atggcctcag ttccgaaaac caacaaaata
gaaccgcggt cctattccat tattcctagc tgcggtatcc aggcggctcg ggcctgcttt
gaacactcta attttttcaa agtaaacgtt cgggccccgc gggacactca gctaagagca
tcgagggggc gccgagaggc aaggccgagc tctaggccgg ccggcggtgg cggcggcgag
gccgggactc gggcttaggg cctgctgtgg aggcagcggc ggacgccgag ctaagcagtt
tctctggaaa cccccctggt aagtgtggag gaggcgggac actctgaccc aagacgaaag
gcctgtagct ccagccaaag aaaataaacc ttaggaggga gaaggaaaaa aaaatccatc
agctgttcct gagaacagcc tgcattggaa tctacagaga ggacaactaa tgtgagtgag
gaagtgactg tatgtggact gtggagaaag taagtcacgt gggcccttga ggacctggac
tgggttagga acagttgtac tttcagaggt gaggtgtcga gaagggaaag tgaatgtggt
ctggagtgtg tccttggcct tggctccaca gggtgtgctt tcctctgggg ccgtcaggga
gctcatccct tgtgttctgc cagggtgggg tacggggttt gacactgagg agggtaacct
gctggctgga gcggcagagc agtggccttg atttgtcttt tggaagattt taaaaaccaa
aaagcataaa cattctggtc cttcagcaat gctttctctg aagaaatact taacggaagg
acttctccag ttcaccattc tgctgagttt gattggggta cgggtggacg tggatactta
cctgacctca cagcttcccc cactccggga gatcatcctg gggcccagtt ctgcctatac
tcagacccag ttccacaacc tgaggaatac cttggatggc tatggtatcc accccaagag
catagacctg gacaattact tcactgcccg gcggctcctc agtcaggtga gggccctgga
caggttccag gtgccaacca ctgaggtaaa tgcctggctg gttcaccgag acccagaggg
gtctgtctct ggcagtcagc ccaactcagg cctcgccctc gagagttcca gtggcctcca
agatgtgaca ggcccagaca acggggtgcg agaaagcgaa acggagcagg gattcggtga
1381 agatttggag gatttggggg ctgtagcccc cccagtcagt ggagacttaa ccaaagagga
1441 catagatctg attgacatcc tttggcgaca ggatattgat ctgggggctg ggcgtgaggt 
1501 ttttgactat agtcaccgcc agaaggagca ggatgtggag aaggagctgc gagatggagg 
1561 cgagcaggac acctgggcag gcgagggcgc ggaagctctg gcacggaacc tgctagtgga 
1621 tggagagact ggggagagct tccctgcaca gtttccagca gacatttcca gcataacaga 
1681 agcagtgcct agtgagagtg agccccctgc tcttcaaaac aacctcttgt ctcctcttct 
1741 gaccgggaca gagtcaccat ttgatttgga acagcagtgg caagatctca tgtccatcat
1801 ggaaatgcag gccatggaag tgaacacatc agcaagtgaa atcctgtaca gtgcccctcc 
1861 tggagaccca ctgagcacca actacagcct tgcccccaac actcccatca atcagaatgt 
1921 cagcctgcat caggcgtccc tggggggctg cagccaggac ttcttactct tcagccccga 
1981 ggtggaaagc ctgcctgtgg ccagtagctc cacgctgctc ccgttggccc ccagcaattc 
2041 taccagcctc aactccacct tcggctccac caacctgaca gggctcttct ttccacccca 
2101 gctcaatggc acagccaatg acacagcagg cccagagctg cctgaccctt tggggggtct 
2161 gttagatgaa gctatgttgg atgagatcag ccttatggac ctggccattg aagaaggctt 
2221 taaccctgtg caggcctccc agctggagga ggaatttgac tctgactcag gcctttcctt 
2281 agactcgagc catagccctt cttccctaag cagctctgaa ggcagttctt cctcttcttc 
2341 ctcctcctct tcctcttctt cctctgcttc ttcctctgcc tcttcctcct tttctgagga 
2401 aggtgcggtt ggctacagct ctgactctga gaccctggat ctggaagagg ccgagggtgc 
2461 tgtgggctac cagcctgagt attccaagtt ctgccgcatg agctaccagg atccagctca 
2521 gctctcatgc ctgccctacc tggagcacgt gggccacaac cacacataca acatggcacc 
2581 cagtgccctg gactcagccg acctgccacc acccagtgcc ctcaagaaag gcagcaagga 
2641 gaagcaggct gacttcctgg acaagcagat gagccgggat gagcaccgag cccgagccat 
2701 gaagatccct ttcaccaatg acaaaatcat caacctgcct gtggaggagt tcaatgaact 
2761 gctgtccaaa taccagttga gtgaagccca gctgagcctc atccgagaca tccggcgccg 
2821 gggcaagaac aagatggcgg cgcagaactg ccgcaagcgc aagctggaca ccatcctgaa 
2881 tctggagcgt gatgtggagg acctgcagcg tgacaaagcc cggctgctgc gggagaaagt 
2941 ggagttcctg cgctccctgc gacagatgaa gcagaaggtc cagagcctgt accaggaggt 
3001 gtttgggcgg ctgcgagatg agaacggacg accctactcg cccagtcagt atgcgctcca 
3061 gtacgccggg gacggcagtg tcctcctcat cccccgcacg atggccgacc agcaggcccg 
3121 gcggcaggag aggaagccaa aggaccggag aaagtgagcc tggggaagaa gggggtttga 
3181 agcccaccaa gaccgaaact ggagaagggc tggacctgga cctggacctg gacctacagc 
3241 ggggacttaa atgccttctt atccaatata tcttctcaga tgggatgact gcgggtcagt 
3301 gtacaggaag aggcaggcac tggctggctc agctccactc gggtggagtg gaagtggcca 
3361 gaccatttag acggacaggg tcctcaccct acccctttcc tgtgaggcag gggtggtggt 
3421 ggagttgctg gaggtagagg agctatgtgg agcaaaggcc gacagagggg aaggaatgga 
3481 cctgtgagag gaagggaagg tggcagaaag tctcatttca ggaaggaggg atagaaggaa 
3541 ggaaggaagg aacccccccc cccccgaaaa aaaaatcaaa gcgggaagaa aatcagaggg 
3601 aaggttaagg ttggctctgg ccaggattcc aggcagcagg ttggagtgac tggtgggcct 
3661 agatcactgg tgtgataaac cccatttcac cccggggggg gtggggtaca cagacacagg 
3721 gtgggggtgg ggaggggcgg tgttaactct ttctgctcct tgcattttga catccctgaa 
3781 ggggagctct tggatatcat tggccatgtt tcaatcgaat ggagccactg ggccccaaca 
3841 ctggctttga gatttagagt caaagggtag agtgaacagg aaagggtcac gtggtcccat 
3901 gttgcaacag ccccaacata cgcatgtcat tcactgcctt gccactccat ctccctccgt 
3961 gctccagcca cccctgagct gaggctccca ttgtctccat cagagcctgc atgtgtatgc 
4021 cgtcctcccc tggtccggtg tttgtgttcc ccacccctca cagactgcct gagctcttct 
4081 gtaagctggg gtagggtgat ggcagtgctc cgggaactgg gcctgcagcc ttcctcttct 
4141 gggactgctg tgaggcagag gaatgatgga gaatctagtg tagcagcctc caggcaggat 
4201 tcagcacaac actggggagt cacccttccc tcgggcctct gcctaccaac aactgggctt 
4261 atcactggga aaacacaaaa aattacacaa cccagcaaca acaaaagaac tagtcctctt 
4321 agaatttctt gcgctttgat ttttttaggg cttgtgccct gtttcactta tagggtctag 
4381 aatgcttgtg ttgagtaaaa aggagatgcc caatattcaa agctgctaaa tgttctcttt 
4441 gccataaaga ctccgtgtaa ctgtgtgaac acttgggatt tttctcctct gtcccgaggt 
4501 cgtcgtctgc tttctttttt gggtttcttt ctagaagatt gagaagtgca tatgacaggc 
4561 tgagagcacc tccccaaaca cacaagctct cagccacagg cagcttctcc acagccccag 
4621 cttcgcacag gctcctggag ggctgcctgg gggaggcaga catgggagtg ccaaggtggc 
4681 cagatggttc caggactaca atgtctttat ttttaactgt ttgccactgc tgccctcacc 
4741 cctgcccggc tctggagtac cgtctgcccc agacaagtgg gagtgaaatg ggggtggggg 
4801 gaagcactga ttcccagtta gggggtgcct aactgagcag tagggataga aggtgtgaac 
4861 ctgggagtgc ttttataaat tattttcctt gtagatttta tttttaattt atctctgtga 
4921 cctgccaggg agaggggaga gagagagaga tgctgttgag cacatgacaa aataaaataa 
BLASTP 2.2.6 [Apr-09-2003]

RID: 1067904350-17957-366068.BLASTQ3
Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF
Sequences producing significant alignments:                        Score    E
                                                                   (bits) Value

gi|542952|pir||A49672 transcription factor Nrf1 - human >gi...     1210   0.0
gi|21748606|dbj|BAC03440.1| FLJ00380 protein [Homo sapiens]        1200   0.0
gi|4505379|ref|NP_003195.1| nuclear factor (erythroid-deriv...     1195   0.0
gi|31982173|ref|NP_032712.2| nuclear factor, erythroid deri...     1166   0.0
gi|6831586|sp|Q61985|NFL1_MOUSE Nuclear factor erythroid 2 ...     1161   0.0
gi|34873489|ref|XP_340887.1| similar to nuclear factor, ery...
gi|3978250|gb|AAC83235.1| Nrf1 splice variant D [Mus musculus]
gi|520471|gb|AAA20466.1| transcription factor LCR-F1
gi|37590281|gb|AAH59314.1| Unknown (protein for MGC:68992) ...
gi|12836061|dbj|BAB23483.1| unnamed protein product [Mus mu...
gi|5441517|emb|CAB46813.1| bZIP protein [Canis familiaris]
gi|37682103|gb|AAQ97978.1| nuclear factor-like 1 [Danio rerio]
gi|3108203|gb|AAC40108.1| nuclear factor erythroid-related ...
gi|21918831|ref|NP_004280.3| nuclear factor (erythroid-deri...
gi|20912933|ref|XP_126805.1| similar to Nuclear factor eryt...
gi|33525212|gb|AAH56142.1| NFE2L3 protein [Homo sapiens]
gi|4521225|dbj|BAA76288.1| NF-E2-related factor 3 [Homo sap...
gi|29351561|gb|AAH49219.1| NFE2L3 protein [Homo sapiens]
gi|6563268|gb|AAF17228.1| NFE2-related factor 1 [Homo sapiens]
gi|37545874|ref|XP_017121.4| similar to nuclear factor (ery...
gi|6754834|ref|NP_035033.1| nuclear factor, erythroid deriv...
gi|34855866|ref|XP_231763.2| similar to Nrf3 [Rattus norveg...
gi|28277788|gb|AAH45852.1| Similar to nuclear factor (eryth...
gi|2137618|pir||I49261 p45 NF-E2 related factor 2 - mouse >...
gi|33504557|ref|NP_878309.1| nuclear factor (erythroid-deri...
gi|6754832|ref|NP_035032.1| nuclear factor, erythroid deri...
gi|13929118|ref|NP_113977.1| NF-E2-related factor 2 [Rattus...
gi|2134328|pir||I50224 erythroid cell transcription factor
gi|27695228|gb|AAH43997.1| Similar to nuclear factor (eryth...
gi|2136301|pir||I59340 transcription factor Nrf2 - human >g...
gi|33469085|ref|NP_032711.1| nuclear factor, erythroid deri...
gi|20903673|ref|XP_128255.1| nuclear factor, erythroid deri...
gi|20149576|ref|NP_006155.2| nuclear factor (erythroid-deri...
gi|5453774|ref|NP_006154.1| nuclear factor (erythroid-deriv...
gi|1082640|pir||A54692 transcription factor NF-E2 45K chain...
gi|506818|gb|AAA35612.1| leucine zipper protein
gi|31200287|ref|XP_309091.1| ENSANGP00000003712 [Anopheles ...
gi|28077099|ref|NP_778208.1| nuclear factor (erythroid-deri...
gi|24649238|ref|NP_732834.1| cap-n-collar CG17894-PB [Droso...
gi|3859887|gb|AAC72897.1| cap 'n' collar isoform B [Drosoph...
gi|24649240|ref|NP_732835.1| cap-n-collar CG17894-PA [Droso...
gi|24649236|ref|NP_732833.1| cap-n-collar CG17894-PC [Droso...
gi|3859885|gb|AAC72896.1| cap 'n' collar isoform A [Drosoph...
gi|7511827|pir||T13936 collar protein isoform C - fruit fly...
gi|103368|pir||A33111 segmentation protein cnc - fruit fly
gi|1352098|sp|P20482|CNC_DROME Segmentation protein cap'n'c...
gi|157074|gb|AAB59246.1| segmentation protein [Drosophila m...
gi|34868687|ref|XP_345884.1| similar to transcription facto...
gi|34867531|ref|XP_221712.2| similar to Bach1 [Rattus norve...
gi|2565400|gb|AAB84100.1| transcription regulator protein [...
gi|4502353|ref|NP_001177.1| BTB and CNC homology 1, basic l...
gi|6680764|ref|NP_031546.1| BTB and CNC homology 1 [Mus mus...
gi|7768712|dbj|BAA95505.1| transcription regulator protein ...
gi|13540490|ref|NP_068585.1| BTB and CNC homology 1, basic ...
gi|13898847|gb|AAK48898.1| BACH2 transcription factor [Homo...
gi|30109320|gb|AAH51242.1| Similar to BTB and CNC homology ...
gi|6671608|ref|NP_031547.1| BTB and CNC homology 2 [Mus mus...
gi|2565402|gb|AAB84101.1| Bach1 protein homolog [Homo sapiens]
gi|34867221|ref|XP_232858.2| similar to BTB and CNC homolog...
gi|5739132|gb|AAD50356.1| Cap'n'collar protein [Thermobia d...
gi|25148068|ref|NP_741404.1| the Binding Domain Of Skn-1 In...
gi|25148072|ref|NP_741406.1| the Binding Domain Of Skn-1 In...
gi|25148077|ref|NP_741405.1| the Binding Domain Of Skn-1 In...
gi|3318844|pdb|1SKN|P Chain P, The Binding Domain Of Skn-1
gi|17565016|ref|NP_503719.1| predicted CDS, the Binding Dom...
gi|15636685|gb|AAL02138.1| transcription factor AP-1 [Branc...
gi|2497469|sp|P70703|JUNB_CYPCA TRANSCRIPTION FACTOR JUN-B
gi|29823878|emb|CAD56858.1| JunB protein [Takifugu rubripes]
gi|2118496|pir||I51606 gene c-jun protein - African clawed...
gi|29823880|emb|CAD56859.1| FJun protein [Takifugu rubripes]
gi|26353686|dbj|BAC40473.1| unnamed protein product [Mus mu...
gi|52759|emb|CAA31252.1| unnamed protein product [Mus muscu...
gi|6680512|ref|NP_032442.1| Jun-B oncogene [Mus musculus] >...
gi|31419519|gb|AAH53234.1| Unknown (protein for MGC:64066)
gi|5650726|emb|CAB51637.1| c-Jun protein [Xenopus laevis]
gi|31339308|dbj|BAC77044.1| c-Jun protein [Carassius auratus]
gi|11177866|ref|NP_068608.1| jun B proto-oncogene [Rattus n...
gi|4504809|ref|NP_002220.1| jun B proto-oncogene [Homo sapi...
gi|710348|gb|AAA74916.1| transcription factor junB
gi|14495707|gb|AAH09465.1| Jun B proto-oncogene [Homo sapiens]
gi|29823874|emb|CAD56856.1| c-Jun protein [Takifugu rubripes]
gi|3023298|sp|P56432|AP1_PIG Transcription factor AP-1 (Act...
gi|225973|prf||1404381A c-jun oncogene
gi|4758616|ref|NP_002219.1| v-jun avian sarcoma virus 17 on...   42
gi|226129|prf||1411298A c-jun gene
gi|6754402|ref|NP_034721.1| Jun oncogene; activator protein...
gi|11177864|ref|NP_068607.1| v-jun sarcoma virus 17 oncogen...
gi|68985|pir||TVHUJN transcription factor AP-1 - human
gi|135295|sp|P18870|AP1_CHICK TRANSCRIPTION FACTOR AP-1 (PR...
gi|21313434|ref|NP_084356.1| RIKEN cDNA 1700012K17; androge...
gi|12838749|dbj|BAB24315.1| unnamed protein product [Mus mu...
Alignments

>gi|542952|pir||A49672 transcription factor Nrf1 - human
 gi|14714932|gb|AAH10623.1| NFE2L1 protein [Homo sapiens]
Length = 742

Score = 1210 bits (3131), Expect = 0.0
Identities = 652/740 (88%), Positives = 652/740 (88%)

Query: 1   MLSLKKYLTEGLLQFTILLSLIGVRVDVDTYLTSQLPPLREIILGPSSAYTQTQFHNLRN 60
           MLSLKKYLTEGLLQFTILLSLIGVRVDVDTYLTSQLPPLREIILGPSSAYTQTQFHNLRN
Sbjct: 1   MLSLKKYLTEGLLQFTILLSLIGVRVDVDTYLTSQLPPLREIILGPSSAYTQTQFHNLRN 60

Query: 61  TLDGYGIHPKSIDLDNYFTARRLLSQVRALDRFQVPTTEVNAWLVHRDPEGSVSGSQPNS 120
           TLDGYGIHPKSIDLDNYFTARRLLSQVRALDRFQVPTTEVNAWLVHRDPEGSVSGSQPNS
Sbjct: 61  TLDGYGIHPKSIDLDNYFTARRLLSQVRALDRFQVPTTEVNAWLVHRDPEGSVSGSQPNS 120

Query: 121 GLALESSSGLQDVTGPDNGVRESETEQGFGEDLEDLGAVAPPVSGDLTKEDIDLIDILWR 180
           GLALESSSGLQDVTGPDNGVRESETEQGFGEDLEDLGAVAPPVSGDLTKEDIDLIDILWR
Sbjct: 121 GLALESSSGLQDVTGPDNGVRESETEQGFGEDLEDLGAVAPPVSGDLTKEDIDLIDILWR 180

Query: 181 QDIDLGAGREVFDYSHRQKEQDVEKELRDGGEQDTWAGEGAEALARNLLVDGETGESFPA 240
           QDIDLGAGREVFDYSHRQKEQDVEKELRDGGEQDTWAGEGAEALARNLLVDGETGESFPA
Sbjct: 181 QDIDLGAGREVFDYSHRQKEQDVEKELRDGGEQDTWAGEGAEALARNLLVDGETGESFPA 240